# Large-scale Visual Encoding
Siglip2 Giant Opt Patch16 256
Apache-2.0
SigLIP 2 is a vision-language encoder whose training recipe combines captioning-based pretraining, self-supervised losses, and online data curation to improve semantic understanding, localization, and dense feature extraction.
Text-to-Image
Transformers

google
3,936
1
Aimv2 3b Patch14 224.apple Pt
AIMv2 is an image encoder packaged for the timm framework and suitable for computer vision tasks such as image classification.
Image Classification
Transformers

timm
50
0
Aimv2 Huge Patch14 224
AIMv2 is a family of vision models pretrained with a multimodal autoregressive objective, with strong reported performance across multiple benchmarks.
Image Classification
apple
54
9
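
The model names above appear to follow the common convention of encoding the patch size and input resolution (for example, `Patch16 256` read as 16-pixel patches on a 256×256 input). Assuming that reading is correct, the short sketch below derives how many patch tokens each encoder would process per image. The helper `patch_grid` is hypothetical and written only for illustration; it is not part of timm or Transformers, and the count excludes any extra tokens a model might add.

```python
import re


def patch_grid(name: str) -> tuple[int, int, int]:
    """Return (patch_size, resolution, num_patches) parsed from a model name.

    Assumes the name contains 'Patch<P> <R>' (or 'patch<P>_<R>'), where P is
    the patch edge in pixels and R is the square input resolution in pixels.
    """
    match = re.search(r"patch(\d+)[ _-](\d+)", name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"no patch/resolution pattern found in {name!r}")
    patch, resolution = int(match.group(1)), int(match.group(2))
    if resolution % patch != 0:
        raise ValueError("resolution is not a multiple of the patch size")
    side = resolution // patch  # patches along one image edge
    return patch, resolution, side * side


# Names copied from the listing above.
names = [
    "Siglip2 Giant Opt Patch16 256",
    "Aimv2 3b Patch14 224.apple Pt",
    "Aimv2 Huge Patch14 224",
]
grids = {name: patch_grid(name) for name in names}
for name, (patch, resolution, tokens) in grids.items():
    print(f"{name}: {resolution}px / {patch}px -> {tokens} patch tokens")
```

Under this assumption all three listed encoders work on a 16×16 grid of patches, since 256 / 16 and 224 / 14 both equal 16.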